AITopics | query processing

Collaborating Authors

query processing

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

0d7363894acdee742caf7fe4e97c4d49-Paper.pdf

Neural Information Processing SystemsFeb-7-2026, 11:39:27 GMT

graph, information, keyword, (13 more...)

Neural Information Processing Systems

Country:

Oceania > Australia > New South Wales (0.04)
Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
(2 more...)

Add feedback

CrackSQL: A Hybrid SQL Dialect Translation System Powered by Large Language Models

Zhou, Wei, Gao, Yuyang, Zhou, Xuanhe, Li, Guoliang

arXiv.org Artificial IntelligenceApr-1-2025

Dialect translation plays a key role in enabling seamless interaction across heterogeneous database systems. However, translating SQL queries between different dialects (e.g., from PostgreSQL to MySQL) remains a challenging task due to syntactic discrepancies and subtle semantic variations. Existing approaches including manual rewriting, rule-based systems, and large language model (LLM)-based techniques often involve high maintenance effort (e.g., crafting custom translation rules) or produce unreliable results (e.g., LLM generates non-existent functions), especially when handling complex queries. In this demonstration, we present CrackSQL, the first hybrid SQL dialect translation system that combines rule and LLM-based methods to overcome these limitations. CrackSQL leverages the adaptability of LLMs to minimize manual intervention, while enhancing translation accuracy by segmenting lengthy complex SQL via functionality-based query processing. To further improve robustness, it incorporates a novel cross-dialect syntax embedding model for precise syntax alignment, as well as an adaptive local-to-global translation strategy that effectively resolves interdependent query operations. CrackSQL supports three translation modes and offers multiple deployment and access options including a web console interface, a PyPI package, and a command-line prompt, facilitating adoption across a variety of real-world use cases

large language model, natural language, translation, (17 more...)

arXiv.org Artificial Intelligence

2504.00882

Country: Asia > China > Shanghai > Shanghai (0.05)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Neuro-Symbolic Query Optimization in Knowledge Graphs

Acosta, Maribel, Qin, Chang, Schwabe, Tim

arXiv.org Artificial IntelligenceNov-21-2024

This chapter delves into the emerging field of neuro-symbolic query optimization for knowledge graphs (KGs), presenting a comprehensive exploration of how neural and symbolic techniques can be integrated to enhance query processing. Traditional query optimizers in knowledge graphs rely heavily on symbolic methods, utilizing dataset summaries, statistics, and cost models to select efficient execution plans. However, these approaches often suffer from misestimations and inaccuracies, particularly when dealing with complex queries or large-scale datasets. Recent advancements have introduced neural models, which capture non-linear aspects of query optimization, offering promising alternatives to purely symbolic methods. In this chapter, we introduce neuro-symbolic query optimizers, a novel approach that combines the strengths of symbolic reasoning with the adaptability of neural computation. We discuss the architecture of these hybrid systems, highlighting the interplay between neural and symbolic components to improve the optimizer's ability to navigate the search space and produce efficient execution plans. Additionally, the chapter reviews existing neural components tailored for optimizing queries over knowledge graphs and examines the limitations and challenges in deploying neuro-symbolic query optimizers in real-world environments.

optimizer, query, query optimization, (14 more...)

arXiv.org Artificial Intelligence

2411.14277

Country:

North America > United States > Texas > Dallas County > Dallas (0.04)
North America > United States > Pennsylvania > Northampton County > Bethlehem (0.04)
Europe > Greece > Attica > Athens (0.04)
(3 more...)

Genre:

Research Report (0.70)
Workflow (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (1.00)

Add feedback

Self-Compositional Data Augmentation for Scientific Keyphrase Generation

Houbre, Mael, Boudin, Florian, Daille, Beatrice, Aizawa, Akiko

arXiv.org Artificial IntelligenceNov-6-2024

State-of-the-art models for keyphrase generation require large amounts of training data to achieve good performance. However, obtaining keyphrase-labeled documents can be challenging and costly. To address this issue, we present a self-compositional data augmentation method. More specifically, we measure the relatedness of training documents based on their shared keyphrases, and combine similar documents to generate synthetic samples. The advantage of our method lies in its ability to create additional training samples that keep domain coherence, without relying on external data or resources. Our results on multiple datasets spanning three different domains, demonstrate that our method consistently improves keyphrase generation. A qualitative analysis of the generated keyphrases for the Computer Science domain confirms this improvement towards their representativity property.

computational linguistic, keyphrase, proceedings, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3677389.3702504

2411.03039

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > Washington > King County > Seattle (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
(26 more...)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

User Intent Recognition and Semantic Cache Optimization-Based Query Processing Framework using CFLIS and MGR-LAU

Mahendru, Sakshi

arXiv.org Artificial IntelligenceJun-6-2024

Query Processing (QP) is optimized by a Cloud-based cache by storing the frequently accessed data closer to users. Nevertheless, the lack of focus on user intention type in queries affected the efficiency of QP in prevailing works. Thus, by using a Contextual Fuzzy Linguistic Inference System (CFLIS), this work analyzed the informational, navigational, and transactional-based intents in queries for enhanced QP. Primarily, the user query is parsed using tokenization, normalization, stop word removal, stemming, and POS tagging and then expanded using the WordNet technique. After expanding the queries, to enhance query understanding and to facilitate more accurate analysis and retrieval in query processing, the named entity is recognized using Bidirectional Encoder UnispecNorm Representations from Transformers (BEUNRT). Next, for efficient QP and retrieval of query information from the semantic cache database, the data is structured using Epanechnikov Kernel-Ordering Points To Identify the Clustering Structure (EK-OPTICS). The features are extracted from the structured data. Now, sentence type is identified and intent keywords are extracted from the parsed query. Next, the extracted features, detected intents and structured data are inputted to the Multi-head Gated Recurrent Learnable Attention Unit (MGR-LAU), which processes the query based on a semantic cache database (stores previously interpreted queries to expedite effective future searches). Moreover, the query is processed with a minimum latency of 12856ms. Lastly, the Semantic Similarity (SS) is analyzed between the retrieved query and the inputted user query, which continues until the similarity reaches 0.9 and above. Thus, the proposed work surpassed the previous methodologies.

query, query processing, signify, (15 more...)

arXiv.org Artificial Intelligence

2406.0449

Country: Europe > Switzerland > Basel-City > Basel (0.04)

Genre: Research Report (0.40)

Industry: Information Technology > Services (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.68)

Add feedback

Dynamic Data Layout Optimization with Worst-case Guarantees

Rong, Kexin, Liu, Paul, Sonje, Sarah Ashok, Charikar, Moses

arXiv.org Artificial IntelligenceMay-8-2024

Many data analytics systems store and process large datasets in partitions containing millions of rows. By mapping rows to partitions in an optimized way, it is possible to improve query performance by skipping over large numbers of irrelevant partitions during query processing. This mapping is referred to as a data layout. Recent works have shown that customizing the data layout to the anticipated query workload greatly improves query performance, but the performance benefits may disappear if the workload changes. Reorganizing data layouts to accommodate workload drift can resolve this issue, but reorganization costs could exceed query savings if not done carefully. In this paper, we present an algorithmic framework OReO that makes online reorganization decisions to balance the benefits of improved query performance with the costs of reorganization. Our framework extends results from Metrical Task Systems to provide a tight bound on the worst-case performance guarantee for online reorganization, without prior knowledge of the query workload. Through evaluation on real-world datasets and query workloads, our experiments demonstrate that online reorganization with OReO can lead to an up to 32% improvement in combined query and reorganization time compared to using a single, optimized data layout for the entire workload.

algorithm, data layout, layout, (17 more...)

arXiv.org Artificial Intelligence

2405.04984

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Industry: Media > Publishing (0.41)

Technology:

Information Technology > Databases (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(2 more...)

Add feedback

A Survey of Learned Indexes for the Multi-dimensional Space

Al-Mamun, Abdullah, Wu, Hao, He, Qiyang, Wang, Jianguo, Aref, Walid G.

arXiv.org Artificial IntelligenceMar-11-2024

A recent research trend involves treating database index structures as Machine Learning (ML) models. In this domain, single or multiple ML models are trained to learn the mapping from keys to positions inside a data set. This class of indexes is known as "Learned Indexes." Learned indexes have demonstrated improved search performance and reduced space requirements for one-dimensional data. The concept of one-dimensional learned indexes has naturally been extended to multi-dimensional (e.g., spatial) data, leading to the development of "Learned Multi-dimensional Indexes". This survey focuses on learned multi-dimensional index structures. Specifically, it reviews the current state of this research area, explains the core concepts behind each proposed method, and classifies these methods based on several well-defined criteria. We present a taxonomy that classifies and categorizes each learned multi-dimensional index, and survey the existing literature on learned multi-dimensional indexes according to this taxonomy. Additionally, we present a timeline to illustrate the evolution of research on learned indexes. Finally, we highlight several open challenges and future research directions in this emerging and highly active field.

learned index, multi-dimensional index, proceedings, (14 more...)

arXiv.org Artificial Intelligence

2403.06456

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Virginia > Arlington County > Arlington (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(6 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.92)
(4 more...)

Add feedback

Multi-Agent Join

Ghadakchi, Vahid, Xie, Mian, Termehchy, Arash, Doskenov, Bakhtiyar, Srikhakollu, Bharghav, Haque, Summit, Wang, Huazheng

arXiv.org Artificial IntelligenceDec-21-2023

It is crucial to provide real-time performance in many applications, such as interactive and exploratory data analysis. In these settings, users often need to view subsets of query results quickly. It is challenging to deliver such results over large datasets for relational operators over multiple relations, such as join. Join algorithms usually spend a long time on scanning and attempting to join parts of relations that may not generate any result. Current solutions usually require lengthy and repeated preprocessing, which is costly and may not be possible to do in many settings. Also, they often support restricted types of joins. In this paper, we outline a novel approach for achieving efficient join processing in which a scan operator of the join learns during query execution, the portions of its relations that might satisfy the join predicate. We further improve this method using an algorithm in which both scan operators collaboratively learn an efficient join execution strategy. We also show that this approach generalizes traditional and non-learning methods for joining. Our extensive empirical studies using standard benchmarks indicate that this approach outperforms similar methods considerably.

algorithm, relation, tuple, (17 more...)

arXiv.org Artificial Intelligence

2312.14291

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Austria > Vienna (0.14)
(15 more...)

Genre: Research Report (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Databases (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

To Draw Is Human: Toward No-Code Subgraph Search

Communications of the ACMJun-23-2023, 02:50:48 GMT

Due to the worldwide shortage of developers, growing talent gap, and budgetary challenges faced by small- and medium-sized businesses in hiring software teams, low-code or no-code frameworks are the latest disruption in the business world.1 For example, SAP recently launched SAP AppGyver, which is a "no-code application development platform that enables developers of all skill levels to create enterprise-ready applications with drag-and-drop simplicity."5 The demand for such low-code or no-code frameworks is not limited to software applications development but also for easy access and search of data residing in databases. Specifically, lay users should be able to access them without needing to write a single line of code. However, query languages (QL)--the primary means to access data residing in databases--enforce end users to be proficient in these languages before they can take advantage of databases for their tasks.

no-code framework, paradigm, query formulation, (13 more...)

Communications of the ACM

Country:

Asia > China > Hong Kong (0.06)
Asia > Singapore (0.05)

Technology:

Information Technology > Information Management (0.75)
Information Technology > Artificial Intelligence > Natural Language (0.54)
Information Technology > Human Computer Interaction > Interfaces (0.34)

Add feedback

Knowledge Graphs Querying

Khan, Arijit

arXiv.org Artificial IntelligenceMay-23-2023

Knowledge graphs (KGs) such as DBpedia, Freebase, YAGO, Wikidata, and NELL were constructed to store large-scale, real-world facts as (subject, predicate, object) triples -- that can also be modeled as a graph, where a node (a subject or an object) represents an entity with attributes, and a directed edge (a predicate) is a relationship between two entities. Querying KGs is critical in web search, question answering (QA), semantic search, personal assistants, fact checking, and recommendation. While significant progress has been made on KG construction and curation, thanks to deep learning recently we have seen a surge of research on KG querying and QA. The objectives of our survey are two-fold. First, research on KG querying has been conducted by several communities, such as databases, data mining, semantic web, machine learning, information retrieval, and natural language processing (NLP), with different focus and terminologies; and also in diverse topics ranging from graph databases, query languages, join algorithms, graph patterns matching, to more sophisticated KG embedding and natural language questions (NLQs). We aim at uniting different interdisciplinary topics and concepts that have been developed for KG querying. Second, many recent advances on KG and query embedding, multimodal KG, and KG-QA come from deep learning, IR, NLP, and computer vision domains. We identify important challenges of KG querying that received less attention by graph databases, and by the DB community in general, e.g., incomplete KG, semantic matching, multimodal data, and NLQs. We conclude by discussing interesting opportunities for the data management community, for instance, KG as a unified data model and vector-based query processing.

information retrieval, machine learning, question answering, (23 more...)

arXiv.org Artificial Intelligence

2305.14485

Country:

North America > United States > California (0.04)
Europe > Denmark > North Jutland > Aalborg (0.04)
Asia > Vietnam (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(2 more...)

Add feedback